Abstract

In this manuscript we present a comparative study of the relaxation (i.e., independence)
time scales obtained from the correlation function, the mutual information, and a
criterion based on a nonextensive generalisation of mutual entropy. Our results show
that, for systems with a small degree of complexity, the standard mutual information and
the criterion based on its nonextensive generalisation provide the same scale, whereas
for systems with more complex dynamics the standard mutual information yields a
consistently smaller time scale.

1 Introduction 

The description of the degree of dependence between variables is of capital importance
in several applications, such as time series analysis, in which it is valuable to
determine for how long a relevant relation between the elements of the series persists.
As examples, we mention: i) the determination of the time scale beyond which a system
can be considered to be in a stationary state, i.e.,

$$\frac{dP(z(t),t)}{dt} = 0, \tag{1}$$

or, in other words, the time needed for a system to achieve such a state ($z(t)$
represents an element of a time series $Z = \{z(t)\}$ at time $t$); ii) the existence of
ageing phenomena, i.e., the dependence of the correlation function,

$$C_z(t_w,\tau) = \frac{\langle z(t_w)\, z(t_w+\tau)\rangle - \langle z(t_w)\rangle \langle z(t_w+\tau)\rangle}{\sqrt{\left(\langle z(t_w)^2\rangle - \langle z(t_w)\rangle^2\right)\left(\langle z(t_w+\tau)^2\rangle - \langle z(t_w+\tau)\rangle^2\right)}}, \tag{2}$$

on the waiting time, $t_w$; iii) the appraisal of how good the forecasting of future
events can be, or even for how long we can produce reliable predictions based on
previous values of the time series; iv) the embedding of time series used in state space
reconstruction and independent component analysis [2, 3, 4, 5], among many other cases.

The most straightforward way of performing this assessment has been the evaluation
of the correlation function. Even though it has widespread applications, for a large
class of processes, specifically complex systems¹, such a procedure is unable to give
a proper answer [5]. Explicitly, the correlation function is a normalised covariance that
is only effective at determining dependences which are either linear or can be written
in a linear way. Hence, by simply applying $C_z(\tau)$, the dependences that do not fit in
the linear classification cannot be correctly measured. It is worth stressing that
non-linear dependences rule a large part of the systems presently studied [6]. In other
words, if we aim to characterise this sort of system we must look at higher-order
correlations to check for statistical independence. To do so, the correlation function is
often replaced by the mutual information, which is able to detect the existence of
non-linearities in the system [HI El [TJ El [9]. In the present work, we carry out
a comparative study between the correlation function, the mutual information (based on
the Kullback-Leibler entropy [10]) and a generalised measure of mutual information [11]
which emerged from a non-additive entropy [12] and has been broadly applied [13].
The comparisons presented are made on discrete time series, which correspond to the large
majority of the time series available for analysis. The results show that when the degree
of complexity is small the two mutual information measures studied provide the same
answer. However, if we increase the complexity of the signal, the two non-linear
measures give different results.

¹A complex system has been consensually defined as a system whose behaviour crucially
depends on its details [1].
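The limitation of the correlation function discussed in this section can be illustrated with a small numerical sketch. It is an illustration only: the helper `pearson`, the seed and the sample size are hypothetical choices, not taken from the study. The variable y = x² is fully determined by x, yet its linear correlation with x is essentially zero.

```python
import random

def pearson(xs, ys):
    """Normalised covariance of two equally long sequences."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / n
    vx = sum((x - mx) ** 2 for x in xs) / n
    vy = sum((y - my) ** 2 for y in ys) / n
    return cov / (vx * vy) ** 0.5

rng = random.Random(0)                      # fixed seed, arbitrary choice
x = [rng.uniform(-1.0, 1.0) for _ in range(20000)]
y = [v * v for v in x]                      # deterministic, non-linear function of x

print("correlation(x, y):", pearson(x, y))  # close to zero despite full dependence
print("correlation(x, x):", pearson(x, x))  # linear dependence is detected
```

A measure defined in probability space, such as the mutual information, does not share this blind spot, which is the motivation for the comparison carried out in the remainder of the text.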



2 Theoretical preliminaries 

Consider the Shannon entropy, $S$, as the average of the surprise, $s'_i$, associated with
a system which has a certain probability distribution $\{p'_i\}$,

$$S = \sum_i p'_i\, s'_i = \sum_i p'_i \ln \frac{1}{p'_i}. \tag{3}$$

Suppose now that the system is modified, or that new measurements are made, giving rise
to a new set of probabilities, $\{p_i\}$, associated with the several states allowed by
the system. From this new set, and for each state $i$, we can define a new value for the
surprise, $s_i = \ln \frac{1}{p_i}$, and its variation,

$$\Delta s_i = s'_i - s_i. \tag{4}$$

Averaging $\Delta s_i$ with respect to the distribution $\{p_i\}$ we have

$$I(\{p\},\{p'\}) = \sum_i p_i\, \Delta s_i = \sum_i p_i \ln \frac{p_i}{p'_i}, \tag{5}$$

which is the Kullback-Leibler entropy. This measure has well-known properties such
as positiveness and concaveness, among many others [14]. Moreover, contrarily to $S$,
$I(\{p\},\{p'\})$ is invariant under a change of variables $x \to \tilde{x} = f(x)$, and is
not symmetric when we swap $p_i$ and $p'_i$. The latter property invalidates the
possibility that $I(\{p\},\{p'\})$ can be considered a metric distance. Nevertheless, we
can still use it as a distance measure in probability space.

Let us now consider that, instead of using the surprise as we have just defined it, we
have a $q$-surprise,

$$s_i^{(q)} = -\ln_q p_i, \tag{6}$$

where

$$\ln_q x = \frac{x^{1-q} - 1}{1-q} \tag{7}$$

(when $q \to 1$, $\ln_q x = \ln x$) [15]. Therefore, the variation of the $q$-surprise is

$$\Delta s_i^{(q)} = s_i'^{(q)} - s_i^{(q)} = \frac{[p_i]^{1-q} - [p'_i]^{1-q}}{1-q}. \tag{8}$$

Computing the $q$-average of $\Delta s_i^{(q)}$ with respect to the distribution $\{p_i\}$,

$$E_p^{(q)}\left[\Delta s^{(q)}\right] = \sum_i [p_i]^q\, \Delta s_i^{(q)} = \sum_i [p_i]^q\, \frac{[p_i]^{1-q} - [p'_i]^{1-q}}{1-q}, \tag{9}$$

and using Eq. (7), we obtain the $q$-generalisation of the Kullback-Leibler entropy,

$$K_q(\{p\},\{p'\}) = -\sum_i p_i \ln_q \frac{p'_i}{p_i}, \tag{10}$$

for which $K_1(\{p\},\{p'\}) = I(\{p\},\{p'\})$. The entropy $K_q(\{p\},\{p'\})$ is positive
for $q > 0$, negative for $q < 0$, and null for $q = 0$ or $p'_i = p_i$ ($\forall i, q$).
It is also provable that $K_q(\{p\},\{p'\})$ is concave for $q > 0$ and convex for $q < 0$
(other properties can be found in Ref. [17]).
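As a minimal numerical sketch of the definitions in Eqs. (5), (7) and (10), the snippet below evaluates the $q$-logarithm and the $q$-generalised Kullback-Leibler entropy for a pair of small distributions. The function names and the two distributions are hypothetical and serve only to illustrate the sign behaviour and the $q \to 1$ limit stated above.

```python
import math

def ln_q(x, q):
    """q-logarithm of Eq. (7); reduces to the natural logarithm at q = 1."""
    if abs(q - 1.0) < 1e-12:
        return math.log(x)
    return (x ** (1.0 - q) - 1.0) / (1.0 - q)

def kullback_leibler(p, p_ref):
    """Standard Kullback-Leibler entropy, Eq. (5)."""
    return sum(pi * math.log(pi / ri) for pi, ri in zip(p, p_ref) if pi > 0.0)

def k_q(p, p_ref, q):
    """q-generalised Kullback-Leibler entropy, Eq. (10)."""
    return -sum(pi * ln_q(ri / pi, q) for pi, ri in zip(p, p_ref) if pi > 0.0)

p = [0.7, 0.3]        # hypothetical distribution {p_i}
p_ref = [0.5, 0.5]    # hypothetical reference distribution {p'_i}

for q in (-1.0, 0.5, 1.0, 2.0):
    print("q =", q, " K_q =", k_q(p, p_ref, q))
print("Kullback-Leibler:", kullback_leibler(p, p_ref))
```

For these inputs the value at $q = 1$ coincides with the standard Kullback-Leibler entropy, positive $q$ gives positive values, and $q = -1$ gives a negative value, in line with the properties listed in the text.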

$K_q$ as a measure of dependence - We shall now consider a bidimensional variable
$z = (x,y)$ for which we want to quantify the degree of dependence between $x$ and $y$.
In the application of $K_q$ to the analysis of the scale of dependence, the most plausible
distribution to be considered as the reference distribution is the product of the
marginal distributions,

$$p'(x,y) = p_1(x)\, p_2(y), \tag{11}$$

where

$$p_1(x) = \sum_y p(x,y), \qquad p_2(y) = \sum_x p(x,y), \tag{12}$$

and $p(x,y)$ is the joint probability distribution.

Using Eq. (5) we can verify that the Kullback-Leibler entropy, which in this case is
named mutual information, can be written as

$$\begin{aligned} I(x,y) &= S(x) + S(y) - S(x,y) \\ &= S(x) - S(x|y) \\ &= S(y) - S(y|x). \end{aligned} \tag{13}$$

From the first equality it is simple to see that $I(x,y)$ only becomes equal to zero when
the variables $x$ and $y$ are independent, i.e., $p(x,y) = p_1(x)\, p_2(y)$. Both $S(x)$ and
$S(y)$ refer to the entropies of the respective marginal distributions, and $S(x,y)$ is
the entropy of the joint distribution. Entropies like $S(x|y)$ are computed as

$$S(x|y) = -\sum_{x,y} p(x,y) \ln p(x|y) = -E_{p(x,y)}\left[\ln p(x|y)\right], \tag{14}$$

in which $E_{\Pi}[Y]$ represents the average of $Y$ associated with distribution $\Pi$.
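The equivalence between the three forms of the mutual information in Eq. (13) and the Kullback-Leibler form with the product of marginals as reference can be checked numerically. The following sketch uses two hypothetical 2x2 joint tables (one dependent, one built as a product of marginals); the function names are illustrative.

```python
import math

def entropy(probs):
    """Shannon entropy (in nats) of a list of probabilities."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)

def mutual_information_forms(joint):
    """Return I(x,y) computed in three ways from a joint table p(x,y)."""
    p1 = [sum(row) for row in joint]          # marginal of x (rows)
    p2 = [sum(col) for col in zip(*joint)]    # marginal of y (columns)
    cells = [(i, j, p) for i, row in enumerate(joint)
             for j, p in enumerate(row) if p > 0.0]
    s_xy = entropy([p for _, _, p in cells])
    # Conditional entropy of Eq. (14), with p(x|y) = p(x,y) / p2(y).
    s_x_given_y = -sum(p * math.log(p / p2[j]) for _, j, p in cells)
    from_entropies = entropy(p1) + entropy(p2) - s_xy
    from_conditional = entropy(p1) - s_x_given_y
    from_kl = sum(p * math.log(p / (p1[i] * p2[j])) for i, j, p in cells)
    return from_entropies, from_conditional, from_kl

dependent = [[0.30, 0.10],
             [0.05, 0.55]]                    # hypothetical joint distribution
independent = [[0.2 * 0.6, 0.2 * 0.4],
               [0.8 * 0.6, 0.8 * 0.4]]        # product of marginals

print(mutual_information_forms(dependent))    # three matching positive values
print(mutual_information_forms(independent))  # three values compatible with zero
```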

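Considering Eq. (11), the $q$-generalisation, $K_q(x,y)$, which is now called generalised
mutual information, can be expressed as

$$K_q(x,y) = \sum_{x,y} \frac{[p(x,y)]^q}{1-q} \left( \left\{1 - [p_1(x)\, p_2(y)]^{1-q}\right\} - \left\{1 - [p(x,y)]^{1-q}\right\} \right), \tag{15}$$

or,

$$K_q(x,y) = -E^{(q)}_{p(x,y)}\left[\ln_q p_1(x) + \ln_q p_2(y) + (1-q) \ln_q p_1(x) \ln_q p_2(y) - \ln_q p(x,y)\right], \tag{16}$$

where $E^{(q)}_{p(x,y)}[\,\cdot\,] = \sum_{x,y} [p(x,y)]^q (\,\cdot\,)$ denotes the $q$-average
used in Eq. (9). Writing

$$p(x,y) = p_1(x)\, p(y|x), \tag{17}$$

and after some algebra, it is then possible to write Eq. (16) as

$$K_q(x,y) = -E^{(q)}_{p(x,y)}\left[\ln_q p_2(y) - \ln_q p(y|x) - (1-q)\left(\ln_q p_1(x) \ln_q p(y|x) - \ln_q p_1(x) \ln_q p_2(y)\right)\right]. \tag{18}$$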

From the expressions above, it is possible to determine the maximum and the minimum
values of $K_q(x,y)$. The minimum value, $K_q(x,y) = 0$, corresponds exactly to the case
in which $p(x,y) = p_1(x)\, p_2(y)$. Complementarily, the maximum value occurs when there
is a bi-univocal dependence between the two variables, i.e., the maximum distance to
independence. In this case, the conditional entropy,

$$S_q(x|y) = -\sum_x [p(x|y)]^q \ln_q p(x|y), \tag{19}$$

must vanish, since the uncertainty of having a value $x$ given $y$ (and, the dependence
being bi-univocal, of having a value $y$ given $x$) is absent. Analytically, this implies

$$E^{(q)}_{p(x,y)}\left[\ln_q p(y|x)\right] = E^{(q)}_{p(x,y)}\left[\ln_q p_1(x) \ln_q p(y|x)\right] = 0. \tag{20}$$

This means that the maximum of $K_q(x,y)$ yields

$$K_q^{\max}(x,y) = -E^{(q)}_{p(x,y)}\left[\ln_q p_2(y) + (1-q) \ln_q p_1(x) \ln_q p_2(y)\right]. \tag{21}$$

The existence of upper and lower bounds allows us to define a ratio, $R_q$,

$$R_q = \frac{K_q}{K_q^{\max}} \in [0,1], \tag{22}$$

that defines the degree of dependence between the two variables $x$ and $y$. For every
case, there exists an optimal entropic index, $q^{op}$, which is related to the degree of
dependence, at which the gradient of $R_q$ is largest, so that $R_q$ is most capable of
detecting small variations in the degree of dependence. In other words, $q^{op}$ is
recognised as the inflexion point of the $R_q$ versus $q$ curves. Regarding the $q^{op}$
values, it is simple to verify that when $x$ and $y$ are independent $R_q = 0$
($\forall q > 0$) and the optimal value is infinite, $q^{op} = \infty$. In the case of
bi-univocal dependence, we have $R_q = 1$ ($\forall q > 0$), which implies, in the limit of
total dependence, that $q^{op} = 0$. Hence, for a certain finite and positive value of
$q^{op}$, it is valid to ascribe a given degree of dependence between the variables $x$
and $y$ that we are analysing.
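The ratio $R_q$ and the optimal index $q^{op}$ can be sketched numerically for a small joint table. The code below is only an illustration under stated assumptions: $K_q$ is evaluated in the form of Eq. (16), the maximum in the form of Eq. (21) written with the marginal $p_2(y)$ in its first term (which coincides with the $p_1(x)$ form under bi-univocal dependence), and $q^{op}$ is located as the grid point where a central-difference estimate of $dR_q/dq$ is largest. The joint tables, grid and function names are hypothetical.

```python
import math

def ln_q(x, q):
    """q-logarithm; reduces to the natural logarithm at q = 1."""
    if abs(q - 1.0) < 1e-12:
        return math.log(x)
    return (x ** (1.0 - q) - 1.0) / (1.0 - q)

def dependence_ratio(joint, q):
    """R_q = K_q / K_q^max for a joint probability table p(x,y)."""
    p1 = [sum(row) for row in joint]
    p2 = [sum(col) for col in zip(*joint)]
    k, k_max = 0.0, 0.0
    for i, row in enumerate(joint):
        for j, p in enumerate(row):
            if p <= 0.0:
                continue
            w = p ** q                                  # weight of the q-average
            a, b = ln_q(p1[i], q), ln_q(p2[j], q)
            ln_product = a + b + (1.0 - q) * a * b      # ln_q of p1(x) * p2(y)
            k += -w * (ln_product - ln_q(p, q))
            k_max += -w * (b + (1.0 - q) * a * b)
    return k / k_max

def optimal_index(joint, q_grid):
    """Grid point where the numerical derivative of R_q is largest."""
    r = [dependence_ratio(joint, q) for q in q_grid]
    slopes = [(r[i + 1] - r[i - 1]) / (q_grid[i + 1] - q_grid[i - 1])
              for i in range(1, len(r) - 1)]
    best = max(range(len(slopes)), key=slopes.__getitem__)
    return q_grid[best + 1]

partial = [[0.4, 0.1], [0.1, 0.4]]        # hypothetical intermediate dependence
bijective = [[0.3, 0.0], [0.0, 0.7]]      # bi-univocal dependence
independent = [[0.1, 0.1], [0.4, 0.4]]    # product of marginals

q_grid = [0.05 + 0.01 * k for k in range(296)]
print("R_2 (partial):", dependence_ratio(partial, 2.0))
print("q_op (partial):", optimal_index(partial, q_grid))
```

For the bi-univocal and independent tables the ratio stays at one and zero, respectively, for every positive $q$ tried, so no finite inflexion point appears, consistent with the limits $q^{op} = 0$ and $q^{op} = \infty$ quoted in the text.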

To conclude this part, let us briefly discuss a very specific case of correlation in which
the system tends to deflect from its past behaviour: anti-correlation. Anti-correlation is
easily verified in the space of variables, since in this case the covariance is negative,
yielding $C_z(t_w,\tau) < 0$. In probability space, i.e., when information measures
are used, negative values cannot be obtained (at least when $q > 0$). In this case, it has
been observed that the value of $q^{op}$ presented by an anti-correlated time series is
smaller than the value presented by the same time series after shuffling [18]. Therefore,
$q^{op} < q^{op}_{(\mathrm{shuffled})}$ can be taken as a signature of anti-correlation.

3 Application to time series analysis 

In what follows, we apply the mutual information measures described above to time series
obtained from mathematical models and to a heuristic time series as well. Our goal is to
determine the time scale, $T$, at which each method considers the elements of a time
series $x(t)$ and $y(t) = x(t+\tau)$ as independent from each other. Since we are dealing
with finite time series, some of the analytical results previously presented are no longer
valid. For example, the value of total independence measured from a finite time series is
not $q^{op} = \infty$, but some finite value of $q^{op}$ instead. However, for a specific
time series, the level of independence can be assessed by shuffling its elements in such
a way that the existing dependencies are wiped out. The scale of interest, $T_K$, is
achieved when $q^{op}(\tau)$ reaches the value of $q^{op}$ of an independent shuffled
series [19]. The same shuffling procedure allows us to determine the noise level of the
correlation function and the minimum value of the mutual information $I$. The minimum
coincides (within error margins) with the mutual information of a shuffled series, and
this match gives us the respective time scale of interest, $T_I$. The linear correlation
scale of independence, $T_C$, is obtained from the intersection of the correlation
function with the noise level, similarly to what is currently done in the recurrence plot
analysis technique [20]. For the sake of simplicity, we consider processes in a stationary
state whose results are independent of the waiting time.
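The shuffling procedure described above can be sketched as follows for the standard mutual information. This is a hedged illustration rather than the procedure actually used in the study: the histogram estimator with eight equal-width bins, the two-standard-deviation acceptance band, the number of shuffles and the autoregressive test signal are all assumptions introduced here.

```python
import math
import random

def binned_mutual_information(xs, ys, bins=8):
    """Histogram estimate (in nats) of the mutual information between xs and ys."""
    def to_bin(v, lo, hi):
        if hi == lo:
            return 0
        return min(int((v - lo) / (hi - lo) * bins), bins - 1)
    lo_x, hi_x, lo_y, hi_y = min(xs), max(xs), min(ys), max(ys)
    n = len(xs)
    joint, px, py = {}, {}, {}
    for x, y in zip(xs, ys):
        i, j = to_bin(x, lo_x, hi_x), to_bin(y, lo_y, hi_y)
        joint[(i, j)] = joint.get((i, j), 0) + 1
        px[i] = px.get(i, 0) + 1
        py[j] = py.get(j, 0) + 1
    return sum(c / n * math.log(c * n / (px[i] * py[j]))
               for (i, j), c in joint.items())

def noise_level(series, lag, n_shuffles, rng):
    """Mean and standard deviation of the measure over shuffled copies."""
    values = []
    for _ in range(n_shuffles):
        shuffled = series[:]
        rng.shuffle(shuffled)               # wipes out temporal dependence
        values.append(binned_mutual_information(shuffled[:-lag], shuffled[lag:]))
    mean = sum(values) / n_shuffles
    var = sum((v - mean) ** 2 for v in values) / n_shuffles
    return mean, math.sqrt(var)

def independence_lag(series, max_lag, n_shuffles=10, n_sigma=2.0, seed=0):
    """First lag at which the measure is compatible with the shuffled baseline."""
    rng = random.Random(seed)
    for lag in range(1, max_lag + 1):
        value = binned_mutual_information(series[:-lag], series[lag:])
        mean, std = noise_level(series, lag, n_shuffles, rng)
        if value <= mean + n_sigma * std:
            return lag
    return None                             # no match found up to max_lag

# Hypothetical test signal: a first-order autoregressive series.
rng = random.Random(1)
series = [0.0]
for _ in range(1999):
    series.append(0.9 * series[-1] + rng.gauss(0.0, 1.0))

print("estimated independence lag:", independence_lag(series, max_lag=40))
```

The same scan can be run with any other dependence measure (for instance the correlation function or $q^{op}$) by swapping the function evaluated on the original and on the shuffled series.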

3.1 Logistic map in the fully chaotic regime 

Consider the following non-linear dissipative map,

$$x_{t+1} = 1 - 2 x_t^2, \qquad x \in [-1,1], \tag{23}$$

which corresponds to the logistic map in the fully chaotic regime. Equation (23) is
probably the most studied non-linear dynamical system [21]. Elements of a time series
obtained from iterating Eq. (23) are associated with the probability distribution

$$p(x) = \frac{1}{B\left(\frac{1}{2},\frac{1}{2}\right) \sqrt{1 - x^2}}, \tag{24}$$

with $B\left(\frac{1}{2},\frac{1}{2}\right)$ being the Beta function. Furthermore, it can be
shown that the summation of successive elements of the series approaches the Gaussian
distribution as the number of terms tends to infinity [22].
As expected, when analysing the autocorrelation function, we have verified that
$C_x(\tau)$ promptly attains the noise level. As a matter of fact, we can write
$C_x(\tau) = 0$ ($\forall \tau \geq 1$), with higher-order correlations different from zero
as shown by Beck in [23]. Computing the mutual information $I$ between time series
elements $x(t)$ and $x(t+\tau)$, we have verified that the noise level is reached for a lag
$T_I = 15$. From the normalisation of the generalised mutual information measure,
$R_q(\tau)$, and for each value of $\tau$, we have computed the optimal values $q^{op}$.
Comparing the values obtained from the logistic map time series with the values obtained
from the same time series after shuffling their elements, we have verified that the
characteristic time scale, for which the condition of independence between variables
prevails, is $T_K = 15$. This is exactly the same time scale indicated by the standard
mutual information procedure. In Fig. 1 we show typical curves of $R_q(\tau)$ for several
values of $\tau$. Each curve has been obtained from averages over different runs (for
specific values see the caption of Fig. 2). From the maximum of every curve (right panel
of Fig. 1) we have computed $q^{op}(\tau)$, exhibited in Fig. 2.
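The contrast between a vanishing linear autocorrelation and non-zero higher-order correlations in the logistic map can be reproduced with a few lines. This is a minimal sketch: the initial condition, burn-in length and series length are arbitrary choices, and the "higher-order" quantity shown (the correlation between $x_t^2$ and $x_{t+1}$) is just one convenient example, not the specific correlations studied in Ref. [23].

```python
def logistic_series(x0, n, burn_in=1000):
    """Iterate x -> 1 - 2 x^2 and return n values after discarding a transient."""
    x = x0
    for _ in range(burn_in):
        x = 1.0 - 2.0 * x * x
    out = []
    for _ in range(n):
        out.append(x)
        x = 1.0 - 2.0 * x * x
    return out

def correlation(xs, ys):
    """Normalised covariance of two equally long sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(xs, ys)) / n
    vx = sum((a - mx) ** 2 for a in xs) / n
    vy = sum((b - my) ** 2 for b in ys) / n
    return cov / (vx * vy) ** 0.5

x = logistic_series(0.3, 20000)
lag_one = correlation(x[:-1], x[1:])                          # linear, lag 1
squared = correlation([v * v for v in x[:-1]], x[1:])         # x_t^2 against x_{t+1}
mean_square = sum(v * v for v in x) / len(x)

print("C_x(1):", lag_one)
print("correlation of x_t^2 with x_{t+1}:", squared)
print("mean of x^2:", mean_square)   # the density of Eq. (24) gives 1/2
```

Because $x_{t+1}$ is an exact linear function of $x_t^2$, the second correlation is at its extreme value while the first one stays at the noise level, which is why a measure in probability space is needed to see the dependence.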




Figure 1: Left: Normalised generalised mutual information $R_q$ of $(x_t, x_{t+\tau})$ vs.
$q$ for series obtained from Eq. (23) for several values of $\tau$, and for series obtained
after shuffling the elements of logistic map sequences. Right: Derivative of the curves
in the left panel with respect to $q$ vs. $q$. The maxima correspond to the inflexion
points of $R_q$, $q^{op}$, which are represented in Fig. 2.

3.2 Autoregressive conditional heteroskedastic process 

Many time series obtained from measurements in complex systems have shown the peculiar
feature of having (long-lasting) correlations in the magnitude of their elements, although
their autocorrelation points to a white-noise-like behaviour. Thus, the characterisation
and modelling of the evolution of the instantaneous variance, $\sigma_t^2$, is of capital
importance when we aim to study that type of dynamics. To mimic this kind of time series,
Engle introduced the autoregressive conditional heteroskedastic (ARCH) process.
Here, we present an extreme case of a generalisation that can be enclosed within the
FIARCH class [25]². Our variable is defined as

$$x_t = \sigma_t\, \omega_t, \tag{25}$$

where $\omega_t$ is a stochastic variable usually associated with a Gaussian distribution
with null mean and unitary variance. The variable $\sigma_t$, also named volatility for
historical reasons, is defined as

$$\sigma_t^2 = a + b \sum_{i=t_0}^{t-1} \mathcal{K}(i - t + 1)\, x_i^2, \tag{26}$$

where

$$\mathcal{K}(t') = \frac{1}{\mathcal{Z}(t')} \exp\left[\frac{t'}{T}\right] \qquad (t' \leq 0,\ T > 0), \tag{27}$$

with $\mathcal{Z}(t')$ being the normalisation. This process originates non-Gaussian,
uncorrelated variables $x_t$. Despite the latter property, the autocorrelation function of
$x_t^2$ (as well as of $\sigma_t$ or $|x_t|$) presents an exponential decay. From the
numerical implementation of Eqs. (25)-(27) with $a = 0.5$, $b = 0.99635$ (obtained for the
case of price fluctuations studied in Ref. [26]), and $T = 10$, we have obtained a set of
time series from which our results have been derived. To ensure that the elements of the
analysed time series are in the stationary state, we have left each numerical
implementation run unrecorded for 10^ steps. As expected, the correlation function of
$\sigma_t$ presents an exponential decay which intersects the noise level at
$\tau = T_C = 772$. From our measurements of the standard mutual information we have found
a larger value of the independence time, which corresponds to a minimum at $T_I = 1093$.
Regarding the application of the $q^{op}$ criterion, we have obtained an even larger time
to bear out independence between variables, $T_K \approx 1500$. This value is clearly apart
from $T_I$. Looking at the dashed (green) line in the lower panel of Fig. 3 we see that the
value of $q^{op}(T_I)$ is below the noise level (even considering error margins). According
to this criterion, this discrepancy indicates that at the time scale $T_I$ there is still a
certain degree of dependence between the variables $\sigma_t$.
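A process of the type defined by Eqs. (25)-(27) can be simulated with a short script. The sketch below rests on explicit assumptions: the memory kernel is taken to be a normalised exponential with characteristic time `T`, the sum over the past is truncated to a finite `window`, and the parameter values are hypothetical ones chosen for a quick run (they are not the values used in the study). It illustrates the stated property that $x_t$ is linearly uncorrelated while $|x_t|$ is not.

```python
import math
import random

def simulate_process(n, a, b, T, window, seed=0):
    """x_t = sigma_t * omega_t with an exponential memory kernel over past x_i^2."""
    rng = random.Random(seed)
    # Unnormalised kernel for t' = 0, -1, ..., -(window - 1) (assumed exponential).
    raw = [math.exp(-k / T) for k in range(window)]
    xs, sigmas = [], []
    for _ in range(n):
        past = xs[-window:][::-1]            # most recent value first (t' = 0)
        weights = raw[:len(past)]
        if past:
            norm = sum(weights)              # normalisation over the available past
            memory = sum(w * x * x for w, x in zip(weights, past)) / norm
        else:
            memory = 0.0
        sigma = math.sqrt(a + b * memory)
        xs.append(sigma * rng.gauss(0.0, 1.0))
        sigmas.append(sigma)
    return xs, sigmas

def correlation(xs, ys):
    """Normalised covariance of two equally long sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((u - mx) * (v - my) for u, v in zip(xs, ys)) / n
    vx = sum((u - mx) ** 2 for u in xs) / n
    vy = sum((v - my) ** 2 for v in ys) / n
    return cov / (vx * vy) ** 0.5

# Hypothetical parameters, chosen only so that the example runs quickly.
x, sigma = simulate_process(n=20000, a=0.5, b=0.5, T=3.0, window=30)
x, sigma = x[1000:], sigma[1000:]            # discard the initial transient
abs_x = [abs(v) for v in x]

print("lag-1 correlation of x:  ", correlation(x[:-1], x[1:]))
print("lag-1 correlation of |x|:", correlation(abs_x[:-1], abs_x[1:]))
```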

3.3 Fluctuations of atmospheric temperature 

Fluctuations of atmospheric temperature have been intensively studied and are a
paradigmatic case of time series analysis. In this next case, we analyse fluctuations of
the daily temperature with respect to the regularised temperature in Rio de Janeiro
(Brazil) between the 1st of January 1995 and the 13th of January 2008, in a total of 4635
observations [27]. Specifically, from the original time series we have obtained the
regularised temperature according to a standard procedure used in climatology. The
fluctuations have then been computed as the difference between the measured temperature
and the regularised temperature (see Fig. 4, left panel). Computing the PDF of these
fluctuations, we have found that they are very well described by a Gaussian, as shown in
the right panel of Fig. 4. In spite of such Gaussian behaviour, when we have estimated
the independence scale, we have verified that the dynamics is in fact governed by
long-memory effects. Numerically, we have noticed that the correlation function reaches
the noise level for $T_C = 35$ days. Using the standard mutual information we have
obtained a minimum value for $T_I = 61$ days, which indicates the existence of
non-linearities governing the dynamics of temperature fluctuations. Nevertheless, the
scale given by $T_I$ appears to be an intermediate one, as happened in the previous
example, since from the application of the $q^{op}$ criterion we have obtained a larger
upper bound for dependence, $T_K \approx 91$ days. We plot these results in Fig. 5. Again,
we verify a hierarchical structure of independence scales furnished by the correlation
function, mutual information, and generalised mutual information. This level of
dependence might be related to the fact that Rio de Janeiro is a coastal city, and is thus
affected by the stability provided at larger scales by the absorption or release of heat
by the sea.

²FIARCH stands for Fractionally Integrated ARCH.

4 Final Remarks 

In this manuscript we have performed a comparative study between correlation and
dependence measures, namely the mutual information and a generalised mutual information
defined within the context of the non-additive entropy $S_q$, aiming to obtain the
respective independence scale between elements. Our analysis has been performed on
discrete-time dynamical systems with different levels of non-linearity and memory.
Explicitly, we have analysed the logistic map, a heteroskedastic process with
long-lasting memory, and a natural time series, namely the fluctuations of atmospheric
temperature. Overall, our results have confirmed the well-known capability of mutual
information for determining the presence of non-linearities. In addition, by increasing
the memory of the system, i.e., raising the level of complexity, differences emerge
between the scale provided by the mutual information, $I$, and by the criterion based on
$q^{op}$, with $T_I$ being consistently smaller than $T_K$. Hence, the comparison of the
results given by different information measures can be a helpful tool for choosing the
most appropriate way to model the dynamics related to the measurements of a certain
observable, or for estimating how far a forecast procedure can go while maintaining a
sufficient level of reliability. We would also like to note that the work we have detailed
points out the relevance of generalised information measures, as has been shown with the
application of the generalised-escort Tsallis entropy [29] to the distinction of
pre-ictal, ictal, and post-ictal stages of epileptic signals [30]. Last of all, we note
that this criterion to set the independence scale can also be used as an alternative
method in the determination of the independence scale and subsequent evaluation of the
embedding dimension of recurrence maps [TRl [20].

Figure 2: Upper left: Correlation function of the logistic map, $x_{t+1} = 1 - 2x_t^2$,
versus lag. The correlation function is at the noise level for $\tau \geq 1$. Upper right:
Mutual information of the logistic map (points) and mutual information of logistic map
shuffled series (line). The matching occurs at the scale $T_I = 15$. Lower: Optimal index
$q^{op}$ versus lag. The points have been obtained from logistic map time series, the line
represents the noise level of $q^{op}$, and the grey lines represent the upper and lower
bounds of the error margins. Once more, the match happens at the scale $T_K = 15$. For
every case, averages over time series of 10^ elements are made.

Figure 3: Upper left: Correlation function of a long-ranged heteroskedastic process,
Eq. (26) with $q_m = 1$ [in log-linear scale]. The correlation function is at the noise
level for $\tau > T_C = 772$ (dotted green vertical line). The dashed blue line has a slope
$400^{-1}$. Upper right: Mutual information of the same process (black line) and mutual
information of the shuffled series (red line). The matching occurs at $T_I = 1093$. Lower:
Optimal index $q^{op}$ versus lag. The points have been obtained from the time series of
the process and the red line represents the noise level of $q^{op}$. The matching happens
at $T_K \approx 1500$, clearly different from $T_I$. For every case, averages over time
series of 10^ elements are made after letting the process evolve for 10^ time steps to
guarantee stationarity.

Figure 4: Left: Evolution of the atmospheric temperature (black line), the regularised
temperature (red line) and the fluctuation, $f$, between measured and regularised
temperatures (green line) in Rio de Janeiro between the 1st of January 1995 and the 13th
of January 2008 (temperatures in degrees Fahrenheit). Right: Probability density function
$P(x)$ vs. $x$, where $x$ represents the detrended and normalised (by its standard
deviation) temperature fluctuations, $\langle f \rangle = 0.15$ and $\sigma_f = 3.6$. As
can be seen, $P(x)$ is very well fitted by a Normal distribution with the error of
adjustment being = 2.4 x 10~^ and = 0.989.

Figure 5: Upper left: Correlation function of the temperature fluctuations presented in
Fig. 4. The correlation function attains the noise level at $\tau = T_C = 35$ days. Upper
right: Mutual information of the same series (black line) and mutual information of
shuffled series (red line). The matching occurs at $T_I = 61$ days. Lower: Optimal index
$q^{op}$ versus lag. The points have been obtained from the time series and the red line
represents the noise level of $q^{op}$. The equalisation happens at $T_K \approx 90$ days,
again plainly away from $T_I$.